Unification of XML Document Structures for Document Warehouse (DocW)

Authors

  • Ines Ben Messaoud
  • Jamel Feki
  • Kaïs Khrouf
  • Gilles Zurfluh
Abstract

Data warehouses and OLAP (On-Line Analytical Processing) technologies analyse huge amounts of structured data that companies store in conventional databases. Recent work underlines the importance of textual data for the decision-making process and therefore motivates building document warehouses. Documents help decision makers better understand the evolution of their business activities. In general, these documents exist in XML format, are geographically distributed, and are described by multiple, differing structures. This paper presents a method for building a distributed document warehouse. The method consists of two steps: i) unification of XML document structures, in order to provide a global and generic view of the distributed document warehouse, and ii) multidimensional modeling of the unified documents for decision-support purposes. More specifically, this paper focuses on the unification step.
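The abstract does not describe how the unification step works, so the following is only a naive illustration of the general idea, not the authors' method. It assumes that a "unified structure" can be approximated by the union of element paths found across several XML documents; the function names `structure_paths` and `unify` and the two sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Return the set of element paths (tags only, text ignored) in one document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def unify(documents):
    """Assumed simplification: the unified structure is the union of all paths."""
    paths = set()
    for doc in documents:
        paths |= structure_paths(doc)
    return sorted(paths)


# Two hypothetical documents with overlapping but different structures.
doc_a = "<article><title>T</title><section><para>p</para></section></article>"
doc_b = (
    "<article><title>T</title><abstract>a</abstract>"
    "<section><title>s</title></section></article>"
)

unified = unify([doc_a, doc_b])
for path in unified:
    print(path)
```

A real unification method would also have to reconcile elements that differ in name but match in meaning, which a plain union of paths does not attempt.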


Similar resources

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As we move to a big-data world in which unstructured data grows at high velocity, a data store is needed that can hold this volume of data. Relational databases, the traditional choice, are not well suited to handling such large amounts of data, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...


C-warehousing: an HL7 CDA-based Approach for the Secondary Use of Clinical Data

This paper proposes a semi-automatic approach to extract information stored in an HL7 Clinical Document Architecture (CDA) document and transform it so that it can be loaded into a Data Warehouse for secondary purposes. It represents a suitable solution to facilitate the design and implementation of Extract, Transform and Load (ETL) tools that are considered the most time-consuming step of the data warehouse develop...


Conceptual Design of XML Document Warehouses

EXtensible Markup Language (XML) has emerged as the dominant standard in describing and exchanging data among heterogeneous data sources. XML with its self-describing hierarchical structure and its associated XML Schema (XSD) provides the flexibility and the manipulative power needed to accommodate complex, disconnected, heterogeneous data. The issue of large volume of data appearing deserves i...


Conversion of XML Schema to Data Warehouse Schema using Automatic Approach

eXtensible Markup Language (XML) is a data exchange format for representing data in Web-based systems. XML is used by many organizations for e-commerce and Internet-based applications such as online shopping, digital libraries, electronic devices, and so on. XML data on the Web is not by itself sufficient for analysis, so industrial organizations need to analyze XML systematically to enable enhan...


Publication date: 2011